Supplementary Material: Organizing recurrent network dynamics by task-computation to enable continual learning
Instead of projecting out previously encountered inputs only (as in OWM), our proposed learning rule modifies both sides of the gradient update. We compare modifications on either side of the gradient update to demonstrate that a double-sided modification reduces forgetting. An alternative interpretation of the action of the projection matrices is in terms of slowing down the learning rate along previously explored directions in network-activity space. Our learning algorithm hence implements an adaptive learning rate schedule dependent on the total variance of activity along input/output directions on previous tasks. We used 64 trials per minibatch during training.
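The difference between a one-sided and a double-sided modification, and the reading of the projection matrices as per-direction learning rates, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the projector form P = (I + C/alpha)^-1, the function names, the toy dimensions, and the random data are all assumptions made for illustration. With this form, a direction with variance v on earlier tasks has its effective learning rate scaled by alpha / (alpha + v).

```python
import numpy as np

def soft_projector(activity, alpha):
    """Assumed projector form: P = (I + C / alpha)^-1, where C is the
    uncentered second-moment matrix of `activity` (n_samples x n_units)."""
    n_samples, n_units = activity.shape
    second_moment = activity.T @ activity / n_samples
    return np.linalg.inv(np.eye(n_units) + second_moment / alpha)

def double_sided_update(grad, p_out, p_in, lr):
    """Weight change for W (n_out x n_in): the raw gradient is multiplied
    by a projector on the output side and on the input side."""
    return -lr * p_out @ grad @ p_in

rng = np.random.default_rng(0)
n_in, n_out = 5, 4

# Hypothetical activity from an earlier task, confined to 2-D subspaces.
old_inputs = rng.normal(size=(200, 2)) @ rng.normal(size=(2, n_in))
old_outputs = rng.normal(size=(200, 2)) @ rng.normal(size=(2, n_out))

p_in = soft_projector(old_inputs, alpha=1e-6)
p_out = soft_projector(old_outputs, alpha=1e-6)

grad = rng.normal(size=(n_out, n_in))  # stand-in for a new-task gradient
delta_w = double_sided_update(grad, p_out, p_in, lr=0.1)

# How much the update changes the response to an input from the earlier task,
# compared with an unmodified gradient step of the same size.
change_on_old_input = np.linalg.norm(delta_w @ old_inputs[0])
raw_change = np.linalg.norm(-0.1 * grad @ old_inputs[0])

# Per-direction learning-rate scaling: the eigenvalues of the projector,
# near 0 along previously used directions and near 1 along unused ones.
rate_scaling = np.sort(np.linalg.eigvalsh(p_in))
```

Setting `p_out` to the identity in this sketch gives a modification on the input side only, which is how the text above characterizes OWM.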
Review for NeurIPS paper: Organizing recurrent network dynamics by task-computation to enable continual learning
Summary and Contributions: This manuscript addresses the problem of continual learning in RNNs. The authors propose a new learning rule that organizes the dynamics for different tasks into orthogonal subspaces. Using a set of neuroscience tasks, they show how this learning rule avoids catastrophic interference between tasks. By analyzing the dynamics of trained networks, they provide evidence for why their learning rule is successful; this analysis also allows them to discuss the problem of transfer learning. Strengths: - Proposes a new, original solution to the problem of continual learning, which also allows the authors to address and understand under which conditions learning in one task can be transferred to learning of another task.
Review for NeurIPS paper: Organizing recurrent network dynamics by task-computation to enable continual learning
The reviewers generally agree that this paper offers a novel viewpoint on avoiding catastrophic forgetting. The theoretical and experimental results are well received. R3 would have preferred to see a deeper discussion of the differences from OWM. However, the authors explained during the rebuttal that, unlike OWM, their learning rule modifies both sides of the gradient update. This characteristic, together with the intricacies involved in considering a sequential application, makes the overall contribution significant enough.
Organizing recurrent network dynamics by task-computation to enable continual learning
Biological systems face dynamic environments that require continual learning. It is not well understood how these systems balance the tension between flexibility for learning and robustness for memory of previous behaviors. Continual learning without catastrophic interference also remains a challenging problem in machine learning. Here, we develop a novel learning rule designed to minimize interference between sequentially learned tasks in recurrent networks. Our learning rule preserves network dynamics within activity-defined subspaces used for previously learned tasks. It encourages dynamics associated with new tasks that might otherwise interfere to instead explore orthogonal subspaces, and it allows for reuse of previously established dynamical motifs where possible.